# UI Element Detection

Omniparser
MIT
OmniParser is a general-purpose screen parsing tool that converts user interface screenshots into structured formats to improve existing LLM-based UI agents.
Image-to-Text Transformers
microsoft
847
1,662
Paligemma 3b Ft Waveui 896
A UI element detection model fine-tuned from PaliGemma 3B 896-resolution weights, specializing in object detection tasks
Image-to-Text Transformers English
agentsea
43
6
Qwen Vl Guidance
Apache-2.0
GUIChat is a multimodal Visual Question Answering (VQA) model that understands image content and answers questions about it, optimized specifically for GUI element recognition and interaction.
Text-to-Image Transformers
RhapsodyAI
46
2
Paligemma 3b Ft Widgetcap Waveui 448
A vision-language model fine-tuned for object detection tasks on the WaveUI dataset, based on PaliGemma 3B 448-resolution weights
Image-to-Text Transformers English
agentsea
344
6